ImageGear Professional DLL for Windows
Detecting Tables without Visible Cell Separators

Tables with visible gridlines (gridded tables) in the original page can usually be detected by the auto-zoning function. Tables without visible cell separators are harder to identify as tables, because their content could alternatively be a word list or data arranged in columns. The Capture SDK offers an algorithm for detecting such non-gridded tables more reliably. This feature can only be used in conjunction with an auto-zoning step. The non-gridded table detection algorithm is based on the result of character recognition. It runs only if all the following conditions are met:

Defining a Zone Manually

C
AT_ERRCOUNT nErrCount;
HIG_REC_IMAGE hImg;
HIGEAR hIGear;
AT_REC_ZONE zone;
nErrCount = IG_load_file("Image.tif", &hIGear );
nErrCount = IG_REC_image_import(hIGear, &hImg);
nErrCount = IG_image_delete(hIGear);
memset(&zone, 0, sizeof(AT_REC_ZONE));
zone.Rect.left = 10;
zone.Rect.top = 20;
zone.Rect.right = 330;
zone.Rect.bottom = 50;
// Textual zone: it contains flowed text
zone.Type = IG_REC_WT_FLOW;
// Use the omnifont filling method for the zone's area
zone.FillingMethod = IG_REC_FM_OMNIFONT;
// Apply the MOR recognition module to the zone's area
zone.RecognitionModule = IG_REC_RM_OMNIFONT_MOR;
// Character set: uppercase letters only
zone.Filter = IG_REC_FILTER_UPPERCASE;
// Insert the zone into the zone list
nErrCount = IG_REC_zone_insert(hImg, 0, &zone);
// Save zone list
nErrCount = IG_REC_zones_save(hImg, "SAMPLE.ZON");
//...
nErrCount = IG_REC_image_delete(hImg);
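
Before inserting a manually defined zone, it can help to sanity-check its rectangle. The following is a minimal sketch, not part of the ImageGear API: ZoneRect and zone_rect_is_valid() are hypothetical names, and the sketch assumes that zone coordinates are pixel offsets from the top-left corner of the image and must lie inside it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the rectangle held in AT_REC_ZONE.Rect.
   The field names follow the sample above; the type is an assumption. */
typedef struct {
    long left, top, right, bottom;
} ZoneRect;

/* Returns true when the rectangle has positive width and height and
   lies inside an image of imgWidth x imgHeight pixels. */
static bool zone_rect_is_valid(const ZoneRect *r, long imgWidth, long imgHeight)
{
    assert(r != NULL);
    if (r->left < 0 || r->top < 0)
        return false;                      /* starts outside the image */
    if (r->right > imgWidth || r->bottom > imgHeight)
        return false;                      /* extends past the image */
    return r->left < r->right && r->top < r->bottom;
}
```

With the values from the sample (left 10, top 20, right 330, bottom 50), the check passes for a 640 x 480 image and fails for an image only 300 pixels wide.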

To get information about any particular zone of the zone list of the image, the application can invoke the IG_REC_zone_info_get() function. This can be useful to find out more about the zones created by the auto-zoning function.

Zone Manipulation: Updating a Zone

C
AT_ERRCOUNT nErrCount;
HIGEAR hIGear;
HIG_REC_IMAGE hImg;
AT_INT nZoneCount;
AT_REC_ZONE zone;
nErrCount = IG_load_file("Image.tif", &hIGear );
nErrCount = IG_REC_image_import(hIGear, &hImg);
nErrCount = IG_image_delete(hIGear);
nErrCount = IG_REC_zones_load(hImg, "SAMPLE.ZON");
nErrCount = IG_REC_zones_count_get(hImg, &nZoneCount);
// Get the first zone in the zone list
nErrCount = IG_REC_zone_info_get(hImg, 0, &zone);
// Move the left border of the zone 10 pixels to the left
zone.Rect.left -= 10;
// Handle the zone content as flowed text
zone.Type = IG_REC_WT_FLOW;
nErrCount = IG_REC_zone_info_set(hImg, 0, &zone);
nErrCount = IG_REC_zones_save(hImg, "SAMPLE.ZON");
//...
nErrCount = IG_REC_image_delete(hImg);
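
The sample subtracts 10 from zone.Rect.left unconditionally, which would produce a negative coordinate for a zone that starts fewer than 10 pixels from the image edge. A possible guard is sketched below. ZoneRect and zone_rect_grow_left() are hypothetical helpers rather than ImageGear functions, and the sketch assumes that negative coordinates are not wanted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the rectangle held in AT_REC_ZONE.Rect. */
typedef struct {
    long left, top, right, bottom;
} ZoneRect;

/* Moves the left border to the left by up to `pixels`, stopping at 0.
   Returns the number of pixels the border actually moved. */
static long zone_rect_grow_left(ZoneRect *r, long pixels)
{
    assert(r != NULL && pixels >= 0);
    long room = (r->left > 0) ? r->left : 0;    /* space left of the zone */
    long moved = (pixels < room) ? pixels : room;
    r->left -= moved;
    return moved;
}
```

In the update sample, the result could be copied back into zone.Rect.left before calling IG_REC_zone_info_set().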

When a table-type zone is updated with the IG_REC_zone_info_set() function, the cell-detection algorithm is not run, so the cells within the zone will not be detected properly.

©2015. Accusoft Corporation. All Rights Reserved.
